Tests for causal prediction #1321
Conversation
Force-pushed from 5df95b8 to 58a2504.
Force-pushed from 58a2504 to 424755d.

Signed-off-by: maartenvanhooft <[email protected]>

Force-pushed from 424755d to 17a97cb.
thanks for adding this PR, @maartenvanhooftds. The tests make sense, but I'm wondering if we can have a stronger test that compares CACM and ERM.
How about the following property: the difference in accuracy between a test dataset drawn from the same distribution as the training data and the main test dataset? That difference would be higher for ERM, and we can check it with a comparison assert. Can you add this for your setup?
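To make the intent concrete, here is a minimal sketch of such a comparison assert; the accuracy values at the bottom are hypothetical and only illustrate the check, not results from the actual test setup:

```python
def assert_cacm_gap_smaller(
    erm_acc_id: float,
    erm_acc_shifted: float,
    cacm_acc_id: float,
    cacm_acc_shifted: float,
) -> None:
    """Assert that CACM loses less accuracy than ERM when moving from an
    in-distribution test split to the main (shifted) test split."""
    erm_gap = erm_acc_id - erm_acc_shifted
    cacm_gap = cacm_acc_id - cacm_acc_shifted
    assert cacm_gap < erm_gap, (
        f"Expected CACM gap ({cacm_gap:.3f}) to be smaller than "
        f"ERM gap ({erm_gap:.3f})"
    )


# Hypothetical accuracies, just to show how the assert would be used:
assert_cacm_gap_smaller(
    erm_acc_id=0.95, erm_acc_shifted=0.60,
    cacm_acc_id=0.93, cacm_acc_shifted=0.85,
)
```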
Great feedback, thanks! Will implement it later this week. Edit: Sorry, it has been taking a bit longer; I just started a new job. It's still on my mind though.
Signed-off-by: maartenvanhooft <[email protected]>
Force-pushed from 2fec8f0 to f228a4e.
thanks for the changes, @maartenvanhooftds. The PR looks good now.
@all-contributors please add @maartenvanhooftds for code
I've put up a pull request to add @maartenvanhooftds! 🎉
Signed-off-by: Amit Sharma <[email protected]>
@maartenvanhooftds Just realized that sometimes the test does not succeed. If you look at the CI build above, the result is around
Thanks for your patience, @amit-sharma. If you ask me, two things are not going right:

To get some insight into 1), I reran without seeds for 100 runs to look at the accuracy distribution of CACM on the val and test splits. I found that even when playing with a higher signal (beta) on the dataloaders, I don't always get sufficient accuracy; occasionally there are just a few outliers. So all in all the results are good, apart from some poor outliers. I think in general the nicest way to solve this is by seeding. The code above passes on my machine 😄 so I must be missing something when setting the seed. Do you have an idea on either:
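For reference, the seeding I tried locally looks roughly like the sketch below (assuming a pytorch-lightning based training loop; the seed value itself is arbitrary):

```python
import random

import numpy as np
import torch
from pytorch_lightning import seed_everything


def set_seed(seed: int = 42) -> None:
    # Seed every RNG the training loop might touch so repeated runs of
    # the flaky test produce the same accuracy numbers.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # seed_everything also seeds dataloader workers when workers=True.
    seed_everything(seed, workers=True)
```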
thanks for looking into this, @maartenvanhooftds. Can you try adding
Otherwise it may be a version mismatch in py3.11 (the test that is failing). There's not too much difference between the GitHub CI env and a local installation, except the exact versions of the packages installed. You may want to check with py3.11 and the packages listed in the CI log here.
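One quick way to compare the local environment against the CI log is a version dump along these lines; the package list below is only a guess at the relevant ones:

```python
from importlib.metadata import PackageNotFoundError, version

# Print locally installed versions so they can be compared against the
# versions shown in the py3.11 CI log.
for pkg in ("torch", "pytorch-lightning", "numpy", "dowhy"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```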
This PR is stale because it has been open for 60 days with no activity.
This PR was closed because it has been inactive for 7 days since being marked as stale.
Inspired by Issue 1313.
Changes:
First contribution here, please review critically for any mistakes or inconsistencies!